Decreasing COPD-related incidences and hospital admissions in a German health insurance population

Chronic obstructive pulmonary disease (COPD) is associated with smoking and work-related health hazards. Most studies have reported prevalences, and the number of studies examining incidences and social inequalities is small. We analyzed the development of social inequalities in COPD-incidences in terms of income, and of exacerbations in terms of hospital admissions. Findings were based on claims data from a German statutory health insurance covering 2008 to 2019. Outpatient diagnoses were used for defining COPD-cases, and hospital admissions were used for detecting exacerbations. Analyses were performed using Cox-regression. Individual incomes were depicted at three levels defined according to national averages for each year. Data of 3,040,137 insured men and women were available. From 2008 to 2019 COPD-incidences decreased by 42% in men and by 47% in women. After stratification by income the reduction at the lowest income level was 41% in men and 50% in women. At the highest income level the respective reductions were 28% and 41%. Disease exacerbations decreased over time, and social inequalities between income groups also emerged. COPD-rates decreased over time at all income levels, but at a faster pace in the lowest income group, thus leading to a positive development of diminishing social gradients in men as well as in women.

Against the backdrop of the considerations above we conducted a trend study to examine COPD-incidences as diagnosed based on spirometry using routinely collected health insurance data. The main topic is the long-term development of incidences, with a focus on social disparities in terms of income. Finally, exacerbations as depicted by hospital admissions are considered.
Our overall aim was to explore trends in COPD-incidence over time. This was to be achieved by pursuing four sub-objectives:
• To examine whether COPD-incidence rates are increasing or decreasing over time. Against the backdrop of decreasing tobacco consumption we expected COPD-rates to decrease as well.
• To examine whether changes in COPD-rates differ by social position. Socio-economic position is depicted by income.
• To examine whether the two developments formulated above occur in men as well as in women.
• Finally, to examine whether the abovementioned changes also apply to hospital admissions as indicators of exacerbated disease states.

Data and methods
Our study is a trend study with analyses based on the complete set of claims data of a large German statutory health insurance covering all individuals aged 18 years and older insured in the calendar years 2008 to 2019.
No further exclusion criteria were applied. Outpatient diagnoses were used for defining COPD-cases. The data were drawn from the Allgemeine Ortskrankenkasse Niedersachsen (AOKN), which insures around one third of the population in the federal state (Land) of Lower Saxony. We obtained the basis for our study as an anonymized and de-identified dataset as prepared by the AOKN. In Germany, health insurance is mandatory for all citizens, and about 90% of the population has coverage under the regulation of the statutory health insurance system with premiums depending on income 21. Statutory health insurances operate under public law, and health care plans are defined uniformly according to nationwide regulations.
As the data of only one statutory health insurance were available, it had to be examined whether and to what extent the insurance population differs from the German population. This topic was examined in two studies that compared the health insurance data with national figures as issued by the German statistical office and by the Federal Employment Agency.
The first study covered the years 2004 and 2009. The proportions of men and women as well as the distribution of age did not differ from the German population. The proportion of employed men also did not differ, but in women the employment rate was lower in the insurance population. In the insurance population the proportion of individuals employed in lower qualified positions was also higher 22. The second comparison covered the years 2012 and 2017. It turned out that the sex distribution did not differ between the health insurance and the German population. The proportion of individuals below 30 years of age was higher among the insured, and again a higher share of the insurance population was employed in jobs requiring lower qualification levels 23. Nevertheless, due to the high number of individuals the insurance data covered the full range of occupations within the realm of the statutory health insurance system, i.e., individuals beyond a certain upper income limit, civil servants and officials were not represented in our data.
Routinely collected health insurance data are mainly used for accounting purposes. Thus, health-related and other behaviours as well as out-of-pocket medications remain unrecorded as long as they are not associated with a diagnosis that triggers reimbursement. Health insurance data are left-, right-, or interval-censored. Thus, beginnings or ends of observation periods may be rather arbitrary. For prevalent cases this is less important, but incidences need to be dated precisely. In order to prevent prevalent cases from being misclassified as incident ones, we counted a case as incident only if the first diagnosis followed an observation period of at least one year without a diagnosis.
COPD-diagnoses were coded by physicians and classified according to ICD-10: J44. As physicians in the German health care system are allowed to assign all diagnoses, we counted only cases that were classified as "assured" and diagnosed by internists, family physicians or pulmonologists. In order to avoid falsely assigned diagnoses, a case was counted only if a COPD-diagnosis was assigned at least twice within four quarterly periods following the first diagnosis; this conservative rule leads to underestimated COPD-rates.
The ICD-10 classification system permits classifying disease severity by coding the last two digits, but in practice the coding was not always available with sufficient accuracy. Disease exacerbations were depicted by hospital admissions 3. We used the ICD-code and the date of admission. As for outpatient data, the episode recorded first was used for classifying cases as exacerbated.
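The confirmation rule for outpatient diagnoses described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the representation of diagnoses as (year, quarter) pairs, and the reading of the rule as "the first diagnosis plus at least one further diagnosis within the four following quarters" are assumptions.

```python
from typing import List, Tuple


def quarter_index(year: int, quarter: int) -> int:
    """Map (year, quarter) to a running quarter number (hypothetical helper)."""
    return year * 4 + (quarter - 1)


def is_confirmed_case(diagnosis_quarters: List[Tuple[int, int]],
                      window: int = 4,
                      min_total: int = 2) -> bool:
    """Return True if at least `min_total` distinct diagnosis quarters fall
    between the first diagnosis and `window` quarters after it (inclusive).

    Assumption: repeated diagnoses within the same quarter count only once.
    """
    if not diagnosis_quarters:
        return False
    indices = sorted({quarter_index(y, q) for y, q in diagnosis_quarters})
    first = indices[0]
    in_window = [idx for idx in indices if first <= idx <= first + window]
    return len(in_window) >= min_total


# A second diagnosis two quarters after the first confirms the case.
print(is_confirmed_case([(2010, 1), (2010, 3)]))
# A second diagnosis five quarters later does not.
print(is_confirmed_case([(2010, 1), (2011, 2)]))
```

Under a stricter reading of the rule (two further diagnoses after the first), `min_total` would be set to 3.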

Identification of incident cases
Our dataset was longitudinal, thus making it possible to identify incident cases. As the data were left-, right-, and interval-censored, most cases recorded in the data are prevalent. In order to separate prevalent from incident cases and to minimize the number of cases incorrectly classified as incident, all cases from newly insured individuals were defined as prevalent. If a case was recorded following an individual year of observation without a COPD-diagnosis, it was classified as incident. In the subsequent years such cases were classified as prevalent. This procedure was motivated by the attempt to identify as many correctly classified incident cases as possible. As a side-effect, some incident cases occurring in the first year of observation may be classified as prevalent, thus yielding conservative disease rates.
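The classification rule described in this section can be illustrated with a short sketch. It assumes calendar-year granularity and continuous insurance from the year of entry; the function name and the data layout are hypothetical and not taken from the study.

```python
from typing import Dict, List


def classify_case_years(entry_year: int, diagnosis_years: List[int]) -> Dict[int, str]:
    """Label each year with a COPD diagnosis as 'incident' or 'prevalent'.

    Assumed rule: the first diagnosis is incident only if at least one full
    diagnosis-free year of observation precedes it; diagnoses in the entry
    year and in all years after the first diagnosis are prevalent.
    """
    labels: Dict[int, str] = {}
    years = sorted(set(diagnosis_years))
    if not years:
        return labels
    first = years[0]
    # One diagnosis-free observation year means first >= entry_year + 1.
    labels[first] = "incident" if first - entry_year >= 1 else "prevalent"
    for year in years[1:]:
        labels[year] = "prevalent"
    return labels


# Insured since 2008, first diagnosis in 2011: incident in 2011, prevalent afterwards.
print(classify_case_years(2008, [2011, 2012]))
# Diagnosis already in the entry year: treated as prevalent throughout.
print(classify_case_years(2008, [2008, 2009]))
```

The second call shows the conservative side-effect mentioned above: a truly new case in the first observed year cannot be distinguished from an existing one and is therefore counted as prevalent.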

Income
Income classifications were based on insurance premiums that are linearly dependent on individual income. Income was classified into categories whose cut-offs were determined according to the German average annual income of the former Western federal states of Germany in terms of the pretax salary of employed individuals as reported annually by the Federal Statistical Office of Germany (Statistisches Bundesamt). In order to make incomes of pensioners more comparable to the average German annual salary of working individuals, the average annual income was adjusted for unemployment and payments to pension funds that all employed individuals have to pay. For each year of observation, the corresponding adjusted average annual income served as a reference, and income was classified as deviation from the average annual income. Finally, individual-based income from employment as explained above was classified into three levels: 1) less than 40% of the mean national income, 2) 40-80% of the mean national income, and 3) more than 80% of the mean national income. Individuals for whom no income data were available were classified as missing. They are a heterogeneous group consisting of employed insured without income information, self-employed, family insured, pensioners, and students.
Besides income, the health insurance data also contain education and occupation, but these are only available for employed insured. Other indicators of socio-economic status (SES), such as quality of residence, number of children or family members, or working hours, are unavailable.
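The three-level income classification can be sketched as below. The function name, the level labels, and the reference income used in the example are hypothetical; how values exactly at the 40% and 80% cut-offs were assigned is not stated in the text, so placing them in the intermediate level is an assumption.

```python
from typing import Optional


def income_level(individual_income: Optional[float], reference_income: float) -> str:
    """Classify income relative to the adjusted average annual income of the
    same year (`reference_income`). Returns 'low', 'intermediate', 'high',
    or 'missing' when no income information is available."""
    if individual_income is None:
        return "missing"
    if reference_income <= 0:
        raise ValueError("reference income must be positive")
    ratio = individual_income / reference_income
    if ratio < 0.40:
        return "low"
    if ratio <= 0.80:  # assumption: cut-off values belong to the intermediate level
        return "intermediate"
    return "high"


# Hypothetical reference income of 40,000 for a given year.
for income in (10_000, 20_000, 36_000, None):
    print(income, income_level(income, 40_000))
```

Because the reference income is passed in per call, the same function can be applied year by year with the corresponding adjusted average.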

Statistics
All analyses were performed by means of Cox proportional hazards regression 24. This model is appropriate here as it permits taking covariates, length of observation and censoring into account. At the first step gender differences in the incidence of COPD are considered, and income as indicator of social differentiation and age are included. The main interest of our research questions referred to changes over time as depicted by the variable "year". If multivariate models are estimated, effects of a given variable have to be interpreted as controlled for the other variables in the model. Thus, income effects have to be interpreted as estimations taking all years together. At the second step analyses are extended by stratifying by income levels, and at the final step hospitalizations as indicator of disease exacerbation are considered in separate analyses. Stata version 16.1 was used for performing the statistical analyses 25.
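For orientation, the Cox proportional hazards model named above can be written in its standard form. The covariate coding shown here is an assumption for illustration: dummy variables for calendar year with 2008 as reference, dummy variables for income level with the lowest level as reference, and age as a single term. The text states only that year, income and age enter the model; the separate category for missing income is omitted for brevity.

```latex
h(t \mid x) = h_0(t)\,
  \exp\!\Big( \sum_{j=2009}^{2019} \beta_j \,\mathrm{year}_j
            + \sum_{k=2}^{3} \gamma_k \,\mathrm{income}_k
            + \delta \,\mathrm{age} \Big),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```

In this notation a hazard ratio below 1 for a given year means a lower hazard of obtaining a COPD-diagnosis than in the reference year, holding the other covariates constant.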

Ethics approval
Our study is based on claims data, that is, on routinely collected data. No experiments involving humans were performed. We obtained the basis for our study as an anonymized and de-identified dataset. Data preparation was performed by the AOKN. The use of this sort of data for scientific purposes is regulated by federal law, and the data protection officer of the General Local Statutory Health Insurance of Lower Saxony (AOKN) has approved its use.

Results
The total number of subjects in the dataset was N = 3,040,137 with 1,481,485 (= 48.7%) men and 1,558,652 (= 51.3%) women. The data covered all insured men and women aged 18 years and older. Tables 1 and 2 depict the basic figures of the variables included in the regression analyses. Annual incidence rates were below 1% and decreasing over time while the size of the insurance population was growing. Only a small proportion of patients with COPD was admitted to hospital, and these patients were considerably older when admitted to hospital than at the date of recorded onset (Tables 1 and 2). Kaplan-Meier survival estimates for incidences and for incidence-related hospital admissions and age-standardized incidence rates are documented in the appendix as additional information (see supplementary Figs. 1 and 2 and supplementary table 1).

Regression analyses
Analyses with income, year of observation, and age (Table 3).
At the first step regression analyses with gender and age as covariates were performed in order to decide whether it was appropriate to differentiate between men and women. After controlling for income, a hazard ratio (HR) of 0.81 (p < 0.001; 95% confidence interval: 0.80-0.82) emerged for women compared with men, i.e. men were more likely to obtain a COPD-diagnosis than women. The HR remained numerically unchanged after the model had been extended to include year of diagnosis as independent variable. No separate table is displayed for these findings.
In men COPD-rates were decreasing over time, although a monotonic decrease emerged only in the years 2013 and later. Finally, compared with 2008 as reference year, the rates of 2019 dropped to a level of 58%. A social gradient in terms of income emerged as the rates of the intermediate and the highest income category decreased to 73% and 63% compared with the lowest category. In women the rates were also decreasing over time. A social gradient in terms of income emerged also in women. The gradient was however somewhat smaller than in men. At the second step social disparities for obtaining a COPD-diagnosis were considered after stratification by income. Table 3 revealed that the hazard ratios for obtaining a diagnosis decreased with increasing income level, i.e., over the whole observation period a significant social gradient emerged. With respect to time, the HRs indicated decreasing risks for a COPD-diagnosis as time came closer to the present, although monotonically decreasing HRs emerged only after 2013, and this held for men as well as for women. In men as well as in women the hazard ratio for the year 2010 was higher than in the preceding and in the following year.
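The percentages reported in this section follow directly from the hazard ratios: an HR of 0.58 relative to the reference year corresponds to a 42% lower hazard. The small helper below is an illustrative sketch (the function name is hypothetical) that makes this conversion explicit for the two hazard ratios quoted in the text.

```python
def percent_change_from_hr(hazard_ratio: float) -> float:
    """Relative change in hazard, in percent, compared with the reference
    category. Negative values are reductions, positive values increases."""
    if hazard_ratio <= 0:
        raise ValueError("a hazard ratio must be positive")
    return (hazard_ratio - 1.0) * 100.0


# HR for 2019 versus 2008 in men, as reported in the text.
print(round(percent_change_from_hr(0.58)))  # -42
# HR for women compared with men, as reported in the text.
print(round(percent_change_from_hr(0.81)))  # -19
```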
The proportional hazards assumption was tested based on Schoenfeld residuals. At first the whole model as depicted in Table 3 was considered, and it turned out that, with the exception of the year 2019, the tests for years of observation were not statistically significant. After Bonferroni correction the overall test was also not statistically significant. If the analyses were confined to year of observation, the proportional hazards assumption was fulfilled, and the effects for year of observation were stronger. For income, the results were mixed. For the male and for the female study population the tests were not significant in the multivariate model after Bonferroni correction; if income was tested alone, the proportional hazards assumption was not fulfilled. For age the assumption was not met either.

The development of COPD-rates over time stratified by income
At the third step stratified analyses were performed for each income level in order to examine whether rates are decreasing at all levels (Table 4).
In men the rates of the lowest income group decreased from 2008 to 2019, and the 2019-rate finally dropped to 59% of the rate in 2008 as reference year, although the development before 2013 was not steady. The same held for the intermediate income group with incomes between 40 and 80% of the national level. This held only partly for the highest income group, where rates also dropped, but they changed inconsistently, and only for the last four years were the differences statistically significant. In the group of men for whom no income information was available the rates also dropped over time.
In women a consistent decrease of COPD-rates emerged in the lowest income group, and in 2019 the COPD-rate dropped to 50% of the rate in 2008. The same development occurred in the intermediate and in the highest income group, reaching 62% and 59% of the 2008-rate. As reported for men, the rates in the high-income group developed inconsistently, and the differences to 2008 became statistically significant only after 2015. The group without income data also followed a downward trend.
As already mentioned in the first regression analyses, the HRs for 2010 were higher than in the preceding and in the following year. In the stratified analyses this was found again. In men this applied to the highest, in women to the intermediate income group.
When the long-term development of hazard ratios by income in men is considered, the lowest and the intermediate income groups developed in a similar way. The figures decreased, and the development of unclassified subjects pointed in the same direction. The exception was the highest income group, which started with the lowest proportions of COPD and showed the smallest decrease. The lowest income group ended with 53% of the reference year.

Table 3. COPD-incidences by year of observation and income in men and women: Hazard ratios (HR), standard errors and confidence intervals.

The proportional hazards assumption was tested for every regression analysis depicted in Table 4. In no case did the test turn out to be significant for year of observation. With the exception of the intermediate income group in women, the tests for age were again statistically significant.

Hospitalizations
As displayed in Tables 1 and 2, for newly diagnosed COPD-patients the number of inpatient cases decreased over time. In the regression analyses, hospitalization risks of men and of women decreased over the observation period 2008 to 2019, reaching 47% of the 2008 rate in 2019 (Table 5). In a separate analysis without stratification (details not shown in a table) rates decreased continuously, but there was a marked gender difference as age-controlled rates in women were 47% lower than in men (HR = 0.53; p < 0.001; 95% CI = 0.52-0.54). Apart from decreasing rates over time, social differences in terms of income also emerged, and this held in women as well as in men. In the latter the incidence rate in the lowest income group was 76% lower than in the group with the highest income. In women income differences were smaller, but nevertheless statistically significant.
The proportional hazards assumption was again fulfilled for year of observation, and after Bonferroni correction this also held for all income strata for men as well as for women. For age it was not fulfilled.

Discussion
Our study focused on time trends and social inequalities in the incidence of COPD. It broadens existing knowledge in several ways: few studies on incidences have been published so far as earlier work was focused on prevalence rates, and ours is also the first study to use data from Germany. The findings were based on a longitudinal dataset from a large statutory health insurance, thus making it possible to analyze changes occurring over a longer time period. In contrast to survey-based studies our data covered a complete population where health-related nonresponse does not occur. In addition, large case numbers lead to smaller confidence intervals, and they reduce the risk of falsely overestimating outliers.
We found that incidence rates were decreasing over time, women had lower rates than men, and COPD-rates differed by income level. Smoking rates in Germany decreased in men and in women, but only in men were tobacco-associated cancer rates reported to have decreased significantly 26,27. It needs to be discussed how changing rates in our data can be interpreted. Over the 11 years covered the COPD-incidence dropped by 42% in men and by 47% in women. These magnitudes are within the ranges reported in earlier studies. For Spain the prevalence rates of moderate to severe COPD (according to GOLD-standards) were reported to have decreased by 50% over a decade 4, and for Sweden a 30% reduction in prevalent cases was reported for 1994 to 2009 2. Comparing these studies with our findings yields only an incomplete picture because they reported prevalence rates while our study was based on incidences. A methodologically sound Swedish study examined incidence rates over an observation period of 10 years. It started with subjects with non-serious respiratory symptoms and compared three birth cohorts (1919/1920, 1934/1935 and 1949/1950). After a questionnaire-based screening COPD was confirmed by spirometry. The initial screening made it possible to record also risk factors and preclinical symptoms. Finally, depending on the criteria applied, cumulative 10-year incidences of 8.2% and 13.5% were reported 14.
In the results section it was reported that the HRs for 2010 did not fit into the general downward development of incidence rates. Although a conclusive explanation is difficult, we assume that this may be due to reporting routines within the health insurance, i.e., in a large number of cases delayed reporting might have occurred, so that cases appeared only in the following year.
Our data do not include health-related behaviours and symptoms that are not presented to a physician, but they permitted studying the development of hospitalizations as indicators of exacerbations. The proportion of subjects admitted to hospital finally turned out to be small. COPD-rates also turned out to increase with decreasing income level. So far, our findings on social inequalities in COPD are in line with findings pertaining to other smoking-related diseases, with the most prominent example being lung cancer 9,12,28,29. Separate analyses by income level revealed that in the highest income group no or only minor decreases in individual rates occurred. Significant changes were found only in the intermediate and in the lowest income group. In sum, this development led to a decrease of social gradients mainly in men that was driven by the intermediate and the lowest income group. A report from the USA came to different conclusions. The lowest socioeconomic groups were also driving changing social gradients, but in contrast to our findings the disease rates in the lowest socio-economic group had increased, thus resulting in widening income-related health disparities 5. The study setting was however different from ours since COPD had not been diagnosed by spirometry, and also other respiratory diseases and symptoms had been recorded. In the literature decreasing COPD-rates were explained by declining smoking, the most important risk factor, which also has a social gradient. In Germany the consumption of nicotine has decreased in the last decades: between 1991 and 2016 the sale of cigarettes dropped by 30%. Among smokers the proportion of heavy consumers was also reduced by about 50% from 1998 to 2014 30. Between 1999 and 2013 the reduction of tobacco consumption in Germany was mainly driven by developments in the population holding middle and high socioeconomic positions 31. Regardless of a reduction of nicotine consumption, individuals in lower socioeconomic positions might also be particularly affected by occupation-related exposures such as dust, chemicals, vapours and fumes 32, but there is evidence that significant improvements following workplace interventions in Europe have taken place 33. Reductions of work-related health hazards are mainly directed towards improving conditions of men holding lower occupational positions, which may explain the favourable developments in lower income groups.
In our study we used only income as indicator of socio-economic status 34, because it had the lowest number of missing values. In health insurance data information on education and occupation is available only for employed insured, and considering these indicators would have led to higher numbers of missing values. Thus, we considered only one aspect of social differentiation, although education and occupation as two more prominent and frequently used SES-indicators depict different aspects of socio-economic position 35,36. An alternative approach might be to use classifications of occupations. This approach does however not permit characterizing workplaces with respect to related hazards with sufficient precision, and this type of classification must also take long-term changes of exposures into account. It might be discussed whether the application of age, period and cohort models is a useful extension of our analytic strategy 37. We finally decided against it because our interest was directed towards comparing the development of COPD only over different time periods at population level. Using age, period and cohort models would also have given rise to the often discussed issue whether period and cohort effects can be separated 38.
Our study has some potentially limiting conditions that have to be discussed. Our incidence rates are underestimated because diagnoses made in the first year of observation were not counted as incident; this was done to exclude falsely identified incident cases as far as possible, and it leads to conservative incidence rates. A second source of underestimation may be that COPD-cases had only been diagnosed on the basis of spirometry without also having considered patient-reported symptoms as indicators. The decision not to use symptoms was based on the last update of the GOLD-report from 2022 39. According to these guidelines, patient-reported symptoms may also be used for making a COPD-diagnosis, but the final decision must be backed by spirometry. Apart from hospital admissions, disease severity could not be determined because the recorded ICD-10-diagnoses were lacking sufficient precision. Unfortunately, claims data do not include behavioural measures, and this refers to the abovementioned risk factors, e.g. smoking or the presence of workplace hazards such as dust, gas, fumes, chemical agents and urban air pollution, which can only be drawn from other types of data 3,40. Although for Europe directives on safety and health at work (https://osha.europa.eu/en/european-standards) have been issued, data are either unavailable or published only at cross-sectional level 41. Nevertheless, we can conclude that social gradients in terms of COPD-incidences were decreasing over time, and this development was mainly due to the middle and lower income groups, which makes the development in Germany different from rate changes reported for North America. Taken together, our findings indicate a positive change in the area of health inequalities, and analyses of data pertaining to the following years will reveal whether this development will continue.

• All methods were carried out in accordance with relevant guidelines and regulations. In the present case this applies to the STROBE guidelines.
• The data used for our study were provided by the Allgemeine Ortskrankenkasse Niedersachsen (AOKN, Local Statutory Health Insurance of Lower Saxony). No ethical approval by a local ethics committee was necessary. Data use was permitted according to an agreement between the first author and the AOKN. As a part of this agreement the data protection officer of the Local Statutory Health Insurance of Lower Saxony (AOKN) has approved the data use for scientific purposes.
• No individual informed consent to use the data was required. No experiments or surveys were performed for collecting the data.

Table 1. COPD-incidences for men with age at diagnosis, hospital admissions and income levels as indicator of socioeconomic position on an annual basis; no percentages for total numbers are given because cases are included in more than one year, depending on length of insurance periods.

Table 2. COPD-incidences for women with age at diagnosis, hospital admissions and income levels as indicator of socioeconomic position on an annual basis; no percentages for total numbers are given because cases are included in more than one year, depending on length of insurance periods.

Table 4. COPD-incidences by year of observation and income in women and men.

In the long-term development of hazard ratios in men, the figure in the intermediate income group was 49% of the reference year. The smallest decrease occurred in the highest income category with 70% of the reference year. However, it has to be borne in mind that in the latter group the proportion of individuals with COPD was lowest.

Table 5. Hospital admissions due to a main diagnosis of COPD by year of observation and income level in men and women: Hazard ratios, standard errors and confidence intervals.